Results 1 - 20 of 95
1.
JMIR Res Protoc ; 13: e50568, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38536234

ABSTRACT

BACKGROUND: Diabetic eye screening (DES) represents a significant opportunity for the application of machine learning (ML) technologies, which may improve clinical and service outcomes. However, successful integration of ML into DES requires careful product development, evaluation, and implementation. Target product profiles (TPPs) summarize the requirements necessary for successful implementation so these can guide product development and evaluation. OBJECTIVE: This study aims to produce a TPP for an ML-automated retinal imaging analysis software (ML-ARIAS) system for use in DES in England. METHODS: This work will consist of 3 phases. Phase 1 will establish the characteristics to be addressed in the TPP. A list of candidate characteristics will be generated from the following sources: an overview of systematic reviews of diagnostic test TPPs; a systematic review of digital health TPPs; and the National Institute for Health and Care Excellence's Evidence Standards Framework for Digital Health Technologies. The list of characteristics will be refined and validated by a study advisory group (SAG) made up of representatives from key stakeholders in DES. This includes people with diabetes; health care professionals; health care managers and leaders; and regulators and policy makers. In phase 2, specifications for these characteristics will be drafted following a series of semistructured interviews with participants from these stakeholder groups. Data collected from these interviews will be analyzed using the shortlist of characteristics as a framework, after which specifications will be drafted to create a draft TPP. Following approval by the SAG, in phase 3, the draft will enter an internet-based Delphi consensus study with participants sought from the groups previously identified, as well as ML-ARIAS developers, to ensure feasibility. 
Participants will be invited to score characteristic and specification pairs on a scale from "definitely exclude" to "definitely include," and suggest edits. The document will be iterated between rounds based on participants' feedback. Feedback on the draft document will be sought from a group of ML-ARIAS developers before its final contents are agreed upon in an in-person consensus meeting. At this meeting, representatives from the stakeholder groups previously identified (minus ML-ARIAS developers, to avoid bias) will be presented with the Delphi results and feedback of the user group and asked to agree on the final contents by vote. RESULTS: Phase 1 was completed in November 2023. Phase 2 is underway and expected to finish in March 2024. Phase 3 is expected to be complete in July 2024. CONCLUSIONS: The multistakeholder development of a TPP for an ML-ARIAS for use in DES in England will help developers produce tools that serve the needs of patients, health care providers, and their staff. The TPP development process will also provide methods and a template to produce similar documents in other disease areas. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/50568.

2.
Proc Natl Acad Sci U S A ; 121(11): e2309576121, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38437559

ABSTRACT

An abundance of laboratory-based experiments has described a vigilance decrement of reducing accuracy to detect targets with time on task, but there are few real-world studies, none of which have previously controlled the environment to control for bias. We describe accuracy in clinical practice for 360 experts who examined >1 million women's mammograms for signs of cancer, whilst controlling for potential biases. The vigilance decrement pattern was not observed. Instead, test accuracy improved over time, through a reduction in false alarms and an increase in speed, with no significant change in sensitivity. The multiple-decision model explains why experts miss targets in low prevalence settings through a change in decision threshold and search quit threshold, and we propose it should be adapted to explain these observed patterns of accuracy with time on task. What is typically thought of as standard and robust research findings in controlled laboratory settings may not directly apply to real-world environments, and instead large, controlled studies in relevant environments are needed.


Subject(s)
Breast Neoplasms, Female, Humans, Breast Neoplasms/diagnostic imaging, Mammography, Fatigue, Laboratories, Research Design
3.
BMJ ; 384: e077039, 2024 02 01.
Article in English | MEDLINE | ID: mdl-38302129

ABSTRACT

OBJECTIVE: To explore how the number and type of breast cancers developed after screen detected atypia compare with the anticipated 11.3 cancers detected per 1000 women screened within one three year screening round in the United Kingdom. DESIGN: Observational analysis of the Sloane atypia prospective cohort in England. SETTING: Atypia diagnoses through the English NHS breast screening programme reported to the Sloane cohort study. This cohort is linked to the English Cancer Registry and the Mortality and Birth Information System for information on subsequent breast cancer and mortality. PARTICIPANTS: 3238 women diagnosed as having epithelial atypia between 1 April 2003 and 30 June 2018. MAIN OUTCOME MEASURES: Number and type of invasive breast cancers detected at one, three, and six years after atypia diagnosis by atypia type, age, and year of diagnosis. RESULTS: There was a fourfold increase in detection of atypia after the introduction of digital mammography between 2010 (n=119) and 2015 (n=502). During 19 088 person years of follow-up after atypia diagnosis (until December 2018), 141 women developed breast cancer. Cumulative incidence of cancer per 1000 women with atypia was 0.95 (95% confidence interval 0.28 to 2.69), 14.2 (10.3 to 19.1), and 45.0 (36.3 to 55.1) at one, three, and six years after atypia diagnosis, respectively. Women with atypia detected more recently have lower rates of subsequent cancers detected within three years (6.0 invasive cancers per 1000 women (95% confidence interval 3.1 to 10.9) in 2013-18 v 24.3 (13.7 to 40.1) in 2003-07, and 24.6 (14.9 to 38.3) in 2008-12). Grade, size, and nodal involvement of subsequent invasive cancers were similar to those of cancers detected in the general screening population, with equal numbers of ipsilateral and contralateral cancers. CONCLUSIONS: Many atypia could represent risk factors rather than precursors of invasive cancer requiring surgery in the short term. 
Women with atypia detected more recently have lower rates of subsequent cancers detected, which might be associated with changes to mammography and biopsy techniques identifying forms of atypia that are more likely to represent overdiagnosis. Annual mammography in the short term after atypia diagnosis might not be beneficial. More evidence is needed about longer term risks.
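The counts quoted in the abstract above can be turned into a crude rate as a quick sanity check. The following is a minimal sketch, not the study's method: the published cumulative incidences are time-to-event estimates, whereas this block only divides events by person-years. The function name `rate_per_1000` is a hypothetical helper introduced for illustration.

```python
def rate_per_1000(events: int, denominator: float) -> float:
    """Return events per 1000 units of the denominator (e.g. person-years)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 1000.0 * events / denominator


# Figures quoted in the abstract: 141 cancers over 19 088 person-years.
cancers = 141
person_years = 19_088
crude_rate = rate_per_1000(cancers, person_years)

# Atypia diagnoses in 2015 versus 2010 (the reported "fourfold increase").
detection_ratio = 502 / 119

print(f"Crude rate: {crude_rate:.2f} cancers per 1000 person-years")
print(f"Atypia diagnoses, 2015 vs 2010: {detection_ratio:.1f}x")
```

A crude person-time rate of this kind is not directly comparable with the cumulative incidence per 1000 women reported at one, three, and six years, because it averages over all follow-up durations.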


Subject(s)
Breast Neoplasms, State Medicine, Female, Humans, Cohort Studies, Prospective Studies, Early Detection of Cancer/methods, Breast Neoplasms/diagnostic imaging, Breast Neoplasms/epidemiology, Mammography/methods, England/epidemiology, Mass Screening
4.
Br J Radiol ; 97(1153): 98-112, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38263823

ABSTRACT

OBJECTIVES: To build a data set capturing the whole breast cancer screening journey from individual breast cancer screening records to outcomes and assess data quality. METHODS: Routine screening records (invitation, attendance, test results) from all 79 English NHS breast screening centres between January 1, 1988 and March 31, 2018 were linked to cancer registry (cancer characteristics and treatment) and national mortality data. Data quality was assessed using comparability, validity, timeliness, and completeness. RESULTS: Screening records were extracted from 76/79 English breast screening centres; extraction from the remaining 3/79 was not possible due to software issues. Data linkage was successful from 1997 after introduction of a universal identifier for women (NHS number). Prior to 1997, outcome data are incomplete due to linkage issues, reducing validity. Between January 1, 1997 and March 31, 2018, a total of 11 262 730 women were offered screening, of whom 9 371 973 attended at least one appointment, with 139 million person-years of follow-up (a median of 12.4 person years for each woman included), 73 810 breast cancer deaths, and 1 111 139 any-cause deaths. Comparability to reference data sets and internal validity were demonstrated. Data completeness was high for core screening variables (>99%) and main cancer outcomes (>95%). CONCLUSIONS: The ATHENA-M project has created a large, high-quality, and representative data set of individual women's screening trajectories and outcomes in England from 1997 to 2018; data before 1997 are lower quality. ADVANCES IN KNOWLEDGE: This is the most complete data set of English breast screening records and outcomes constructed to date, which can be used to evaluate and optimize screening.


Subject(s)
Breast Neoplasms, Semantic Web, Female, Humans, State Medicine, Mammography, Breast
5.
Br J Radiol ; 97(1154): 324-330, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38265306

ABSTRACT

Evidence-based clinical guidelines are essential to maximize patient benefit and to reduce clinical uncertainty and inconsistency in clinical practice. Gaps in the evidence base can be addressed by data acquired in routine practice. At present, there is no international consensus on management of women diagnosed with atypical lesions in breast screening programmes. Here, we describe how routine NHS breast screening data collected by the Sloane atypia project was used to inform a management pathway that maximizes early detection of cancer and minimizes over-investigation of lesions with uncertain malignant potential. A half-day consensus meeting with 11 clinical experts, 1 representative from Independent Cancer Patients' Voice, 6 representatives from NHS England (NHSE) including from Commissioning, and 2 researchers was held to facilitate discussions of findings from an analysis of the Sloane atypia project. Key considerations of the expert group in terms of the management of women with screen detected atypia were: (1) frequency and purpose of follow-up; (2) communication to patients; (3) generalizability of study results; and (4) workforce challenges. The group concurred that the new evidence does not support annual surveillance mammography for women with atypia, irrespective of type of lesion, or woman's age. Continued data collection is paramount to monitor and audit the change in recommendations.


Subject(s)
Breast Neoplasms, Clinical Decision-Making, Female, Humans, Consensus, Uncertainty, Breast/diagnostic imaging, Breast/pathology, Mammography/methods, Breast Neoplasms/diagnostic imaging, Breast Neoplasms/pathology
6.
Health Technol Assess ; : 1-32, 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38140927

ABSTRACT

Background: The aim of the study was to investigate the potential effect of different structural interventions for preventing cardiovascular disease. Methods: Medline and EMBASE were searched for peer-reviewed simulation-based studies of structural interventions for prevention of cardiovascular disease. We performed a systematic narrative synthesis. Results: A total of 54 studies met the inclusion criteria. Diet, nutrition, tobacco and alcohol control and other programmes are among the policy simulation models explored. Food tax and subsidies, healthy food and lifestyles policies, palm oil tax, processed meat tax, reduction in ultra-processed foods, supplementary nutrition assistance programmes, stricter food policy and subsidised community-supported agriculture were among the diet and nutrition initiatives. Initiatives to reduce tobacco and alcohol use included a smoking ban, a national tobacco control initiative and a tax on alcohol. Others included the NHS Health Check, WHO 25 × 25 and air quality management policy. Future work and limitations: There is significant heterogeneity in simulation models, making comparisons of output data impossible. While policy interventions typically include a variety of strategies, none of the models considered possible interrelationships between multiple policies or potential interactions. Research that investigates dose-response interactions between numerous modifications as well as longer-term clinical outcomes can help us better understand the potential impact of policy-level interventions. Conclusions: The reviewed studies underscore the potential of structural interventions in addressing cardiovascular diseases. Notably, interventions in areas such as diet, tobacco, and alcohol control demonstrate a prospective decrease in cardiovascular incidents. However, to realize the full potential of such interventions, there is a pressing need for models that consider the interplay and cumulative impacts of multiple policies. 
Rigorous research into holistic and interconnected interventions will pave the way for more effective policy strategies in the future. Study registration: The study is registered as PROSPERO CRD42019154836. Funding: This article presents independent research funded by the National Institute for Health and Care Research (NIHR) Health Technology Assessment programme as award number 17/148/05.


This study aimed to explore the potential effects of various policy changes on the prevention of heart disease. By searching two large medical databases, we identified studies that employed computer models to estimate the impact of these policies on heart disease rates. In total, 54 studies matched our criteria. These studies considered a diverse range of policy interventions. Some delved into food and nutrition, investigating aspects like unhealthy food taxes, healthy food subsidies, stricter food regulations, and nutritional assistance programs. Others examined the impact of policies targeting tobacco and alcohol, encompassing smoking bans, nationwide tobacco control measures, and alcohol taxation. Further policies assessed included routine health checkups, global health goals, and measures to enhance air quality. One significant challenge lies in the varied approaches and models each study employed, making direct comparisons difficult. Furthermore, there's a gap in understanding how these policies might influence one another, as the studies did not consider potential interactions between them. While these policies show promise in the computer models, more comprehensive research is needed to fully appreciate their combined and long-term effects on heart health in real-world scenarios. As of now, we recognize the potential of these interventions, but further studies will determine their true impact on reducing heart disease rates.

7.
J Med Imaging (Bellingham) ; 10(5): 051801, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37915406

ABSTRACT

The editorial introduces the JMI Special Section on Artificial Intelligence for Medical Imaging in Clinical Practice.

8.
Radiology ; 309(1): e222691, 2023 10.
Article in English | MEDLINE | ID: mdl-37874241

ABSTRACT

Background Despite variation in performance characteristics among radiologists, the pairing of radiologists for the double reading of screening mammograms is performed randomly. It is unknown how to optimize pairing to improve screening performance. Purpose To investigate whether radiologist performance characteristics can be used to determine the optimal set of pairs of radiologists to double read screening mammograms for improved accuracy. Materials and Methods This retrospective study was performed with reading outcomes from breast cancer screening programs in Sweden (2008-2015), England (2012-2014), and Norway (2004-2018). Cancer detection rates (CDRs) and abnormal interpretation rates (AIRs) were calculated, with AIR defined as either reader flagging an examination as abnormal. Individual readers were divided into performance categories based on their high and low CDR and AIR. The performance of individuals determined the classification of pairs. Random pair performance, for which any type of pair was equally represented, was compared with the performance of specific pairing strategies, which consisted of pairs of readers who were either opposite or similar in AIR and/or CDR. Results Based on a minimum number of examinations per reader and per pair, the final study sample consisted of 3 592 414 examinations (Sweden, n = 965 263; England, n = 837 048; Norway, n = 1 790 103). The overall AIRs and CDRs for all specific pairing strategies (Sweden AIR range, 45.5-56.9 per 1000 examinations and CDR range, 3.1-3.6 per 1000; England AIR range, 68.2-70.5 per 1000 and CDR range, 8.9-9.4 per 1000; Norway AIR range, 81.6-88.1 per 1000 and CDR range, 6.1-6.8 per 1000) were not significantly different from the random pairing strategy (Sweden AIR, 54.1 per 1000 examinations and CDR, 3.3 per 1000; England AIR, 69.3 per 1000 and CDR, 9.1 per 1000; Norway AIR, 84.1 per 1000 and CDR, 6.3 per 1000). 
Conclusion Pairing a set of readers based on different pairing strategies did not show a significant difference in screening performance when compared with random pairing. © RSNA, 2023.
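The two rates used in the abstract above can be sketched as follows, assuming only the definition given there: an examination counts towards the abnormal interpretation rate (AIR) when either reader flags it. The `Exam` record, the `pair_rates` helper, and the toy counts are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Exam:
    """One double-read screening examination (hypothetical record layout)."""
    reader_a_flag: bool
    reader_b_flag: bool
    cancer_detected: bool


def pair_rates(exams: Sequence[Exam]) -> Tuple[float, float]:
    """Return (AIR, CDR) per 1000 examinations for one reader pair."""
    n = len(exams)
    if n == 0:
        raise ValueError("at least one examination is required")
    # AIR: either reader flagging the examination as abnormal.
    abnormal = sum(e.reader_a_flag or e.reader_b_flag for e in exams)
    cancers = sum(e.cancer_detected for e in exams)
    return 1000.0 * abnormal / n, 1000.0 * cancers / n


# Toy data, invented for illustration only: 1000 examinations.
exams = (
    [Exam(False, False, False)] * 931
    + [Exam(True, False, False)] * 40
    + [Exam(False, True, False)] * 20
    + [Exam(True, True, True)] * 9
)
air, cdr = pair_rates(exams)
print(f"AIR: {air:.1f} per 1000, CDR: {cdr:.1f} per 1000")
```

Comparing pairing strategies would then amount to computing these two rates for each group of pairs and testing the differences, which this sketch does not attempt.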


Subject(s)
Mammography, Physical Examination, Humans, Retrospective Studies, England, Radiologists
9.
Br J Radiol ; 96(1148): 20220972, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37399082

ABSTRACT

OBJECTIVES: To review the methodology of interobserver variability studies, including current practice and quality of conducting and reporting studies. METHODS: Interobserver variability studies between January 2019 and January 2020 were included; extracted data comprised study characteristics, populations, variability measures, key results, and conclusions. Risk of bias was assessed using the COSMIN tool for assessing reliability and measurement error. RESULTS: Seventy-nine full-text studies were included covering various imaging tests and clinical areas. The median number of patients was 47 (IQR:23-88), and the median number of observers was 4 (IQR:2-7), with sample size justified in 12 (15%) studies. Most studies used static images (n = 75, 95%), where all observers interpreted images for all patients (n = 67, 85%). Intraclass correlation coefficients (ICC) (n = 41, 52%), Kappa (κ) statistics (n = 31, 39%) and percentage agreement (n = 15, 19%) were most commonly used. Interpretation of variability estimates often did not correspond with study conclusions. The COSMIN risk of bias tool gave a very good/adequate rating for 52 studies (66%) including any studies that used variability measures listed in the tool. For studies using static images, some study design standards were not applicable and did not contribute to the overall rating. CONCLUSIONS: Interobserver variability studies have diverse study designs and methods, the impact of which requires further evaluation. Sample size for patients and observers was often small without justification. Most studies reported ICC and κ values, which did not always coincide with the study conclusion. High ratings were assigned to many studies using the COSMIN risk of bias tool, with certain standards scored 'not applicable' when static images were used. ADVANCES IN KNOWLEDGE: The sample size for both patients and observers was often small without justification. 
For most studies, observers interpreted static images and did not evaluate the process of acquiring the imaging test, meaning it was not possible to assess many COSMIN risk of bias standards for studies with this design. Most studies reported intraclass correlation coefficient and κ statistics; study conclusions often did not correspond with results.
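Two of the agreement measures named in the abstract above, percentage agreement and Cohen's κ for two observers, can be computed as in the sketch below. It assumes two observers rating the same patients on a categorical scale; the ratings are invented for illustration, and the function names are hypothetical helpers rather than anything used by the reviewed studies.

```python
from collections import Counter
from typing import Hashable, Sequence


def percent_agreement(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Proportion of patients on which the two observers give the same rating."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    observed = percent_agreement(a, b)
    n = len(a)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from each observer's marginal category frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in set(a) | set(b)) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Invented ratings for 50 patients (illustration only).
observer_1 = ["pos"] * 20 + ["neg"] * 30
observer_2 = ["pos"] * 15 + ["neg"] * 5 + ["neg"] * 25 + ["pos"] * 5

agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(f"Agreement: {agreement:.0%}, kappa: {kappa:.2f}")
```

The example shows why the two measures can support different conclusions: 80% raw agreement corresponds to a noticeably lower κ once chance agreement is removed.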


Subject(s)
Diagnostic Imaging, Research Design, Humans, Observer Variation, Reproducibility of Results
10.
BMC Public Health ; 22(1): 2319, 2022 12 12.
Article in English | MEDLINE | ID: mdl-36510247

ABSTRACT

BACKGROUND: Screening programmes aim to identify individuals at higher risk of developing a disease or condition. While globally, there is agreement that people who attend screening should be fully informed, there is no consensus about how this should be achieved. We conducted a mixed methods study across eight different countries to understand how countries address informed choice across two screening programmes: breast cancer and fetal trisomy anomaly screening. METHODS: Fourteen senior level employees from organisations who produce and deliver decision aids to assist informed choice were interviewed, and their decision aids (n = 15) were evaluated using documentary analysis. RESULTS: We discovered that attempts to achieve informed choice via decision aids generate two key tensions (i) between improving informed choice and increasing uptake and (ii) between improving informed choice and comprehensibility of the information presented. Comprehensibility is fundamentally at tension with an aim of being fully informed. These tensions emerged in both the interviews and documentary analysis. CONCLUSION: We conclude that organisations need to decide whether their overarching aim is ensuring high levels of uptake or maximising informed choice to participate in screening programmes. Consideration must then be given to all levels of development and distribution of information produced to reflect each organisation's aim. The comprehensibility of the DA must also be considered, as this may be reduced when informed choice is prioritised.


Subject(s)
Breast Neoplasms, Pregnancy, Female, Humans, Breast Neoplasms/diagnosis, Breast Neoplasms/prevention & control, Prenatal Diagnosis, Decision Making, Mass Screening/methods
11.
Article in English | MEDLINE | ID: mdl-36562488

ABSTRACT

BACKGROUND: Cardiovascular diseases are the leading cause of morbidity and mortality worldwide. The aim of the study was to guide researchers and commissioners of cardiovascular disease preventative services towards possible cost-effective interventions by reviewing published economic analyses of interventions for the primary prevention of cardiovascular disease, conducted for or within the UK NHS. METHODS: In January 2021, electronic searches of MEDLINE and Embase were carried out to find economic evaluations of cardiovascular disease preventative services. We included fully published economic evaluations (including economic models) conducted alongside randomised controlled trials of any form of intervention that was aimed at the primary prevention of cardiovascular disease, including, but not limited to, drugs, diet, physical activity and public health. Full systematic review methods were used with predetermined inclusion/exclusion criteria, data extraction and formal quality appraisal [using the Consolidated Health Economic Evaluation Reporting Standards checklist and the framework for the quality assessment of decision analytic modelling by Philips et al. (Philips Z, Ginnelly L, Sculpher M, Claxton K, Golder S, Riemsma R, et al. Review of guidelines for good practice in decision-analytic modelling in health technology assessment. Health Technol Assess 2004;8(36))]. RESULTS: Of 4351 non-duplicate citations, eight articles met the review's inclusion criteria. The eight articles focused on health promotion (n = 3), lipid-lowering medicine (n = 4) and blood pressure-lowering medication (n = 1). The majority of the populations in each study had at least one risk factor for cardiovascular disease or were at high risk of cardiovascular disease. 
For the primary prevention of cardiovascular disease, all strategies were cost-effective at a threshold of £25,000 per quality-adjusted life-year, except increasing motivational interviewing in addition to other behaviour change strategies. Where the cost per quality-adjusted life-year gained was reported, interventions varied from dominant (i.e. less expensive and more effective than the comparator intervention) to £55,000 per quality-adjusted life-year gained. FUTURE WORK AND LIMITATIONS: We found few health economic analyses of interventions for primary cardiovascular disease prevention conducted within the last decade. Future economic assessments should be undertaken and presented in accordance with best practices so that future reviews may make clear recommendations to improve health policy. CONCLUSIONS: It is difficult to establish direct comparisons or draw firm conclusions because of the uncertainty and heterogeneity among studies. However, interventions conducted for or within the UK NHS were likely to be cost-effective in people at increased risk of cardiovascular disease when compared with usual care or no intervention. FUNDING: This project was funded by the National Institute for Health and Care Research (NIHR) Health Technology Assessment programme and will be published in Health Technology Assessment. See the NIHR Journals Library website for further project information.
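The decision rule described in the abstract above (a threshold of £25,000 per quality-adjusted life-year, with "dominant" meaning less expensive and more effective than the comparator) can be sketched as a small classifier. This is an illustrative sketch only: the function name `classify`, the handling of the remaining quadrants, and the example numbers are assumptions, not results from the reviewed studies.

```python
def classify(delta_cost: float, delta_qaly: float,
             threshold: float = 25_000.0) -> str:
    """Classify an intervention against its comparator.

    delta_cost: incremental cost in pounds (intervention minus comparator).
    delta_qaly: incremental quality-adjusted life-years gained.
    threshold:  willingness to pay per QALY (25,000 as in the abstract).
    """
    if delta_cost < 0 and delta_qaly > 0:
        return "dominant"            # cheaper and more effective
    if delta_cost > 0 and delta_qaly <= 0:
        return "dominated"           # costlier and no more effective
    if delta_cost >= 0 and delta_qaly > 0:
        icer = delta_cost / delta_qaly   # incremental cost per QALY gained
        return "cost-effective" if icer <= threshold else "not cost-effective"
    # Cheaper but less effective: needs a separate judgement (assumption).
    return "indeterminate"


# Hypothetical incremental costs and QALYs, for illustration only.
examples = {
    "A": classify(1_000.0, 0.10),   # 10,000 per QALY
    "B": classify(5_500.0, 0.10),   # 55,000 per QALY
    "C": classify(-200.0, 0.05),    # saves money and gains QALYs
}
for label, verdict in examples.items():
    print(label, verdict)
```

The "indeterminate" branch is a deliberate simplification; a full analysis would treat savings accompanied by health losses explicitly rather than leave them unclassified.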

12.
Article in English | MEDLINE | ID: mdl-36562494

ABSTRACT

BACKGROUND: As part of our ongoing systematic review of complex interventions for the primary prevention of cardiovascular diseases, we have developed and evaluated automated machine-learning classifiers for title and abstract screening. The aim was to develop a high-performing algorithm comparable to human screening. METHODS: We followed a three-phase process to develop and test an automated machine learning-based classifier for screening potential studies on interventions for primary prevention of cardiovascular disease. We labelled a total of 16,611 articles during the first phase of the project. In the second phase, we used the labelled articles to develop a machine learning-based classifier. After that, we examined the performance of the classifiers in correctly labelling the papers. We evaluated the performance of five deep-learning models [i.e. parallel convolutional neural network (CNN), stacked CNN, parallel-stacked CNN, recurrent neural network (RNN) and CNN-RNN]. The models were evaluated using recall, precision and work saved over sampling at no less than 95% recall. RESULTS: We labelled a total of 16,611 articles, of which 676 (4.0%) were tagged as 'relevant' and 15,935 (96%) were tagged as 'irrelevant'. The recall ranged from 51.9% to 96.6%. The precision ranged from 64.6% to 99.1%. The work saved over sampling ranged from 8.9% to as high as 92.1%. The best-performing model was parallel CNN, yielding a 96.4% recall, as well as 99.1% precision, and a potential workload reduction of 89.9%. FUTURE WORK AND LIMITATIONS: We used words from the title and the abstract only. More work is needed to examine how performance changes when further features, such as full document text, are added. The approach might also not transfer to other complex systematic reviews on different topics. 
CONCLUSION: Our study shows that machine learning has the potential to significantly aid the labour-intensive screening of abstracts in systematic reviews of complex interventions. Future research should concentrate on enhancing the classifier system and determining how it can be integrated into the systematic review workflow. FUNDING: This project was funded by the National Institute for Health and Care Research (NIHR) Health Technology Assessment programme and will be published in Health Technology Assessment. See the NIHR Journals Library website for further project information.
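The three evaluation measures named in the abstract above can be computed from a confusion matrix as in the following sketch. Work saved over sampling (WSS) is written here in its commonly used form, (TN + FN) / N − (1 − recall); whether the study used exactly this form is an assumption. The confusion-matrix counts are hypothetical and are not the study's results.

```python
from typing import Dict


def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Recall, precision and work saved over sampling (WSS) for a screener.

    tp/fn: relevant articles the classifier kept / wrongly discarded.
    fp/tn: irrelevant articles the classifier kept / correctly discarded.
    """
    total = tp + fp + tn + fn
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("need at least one relevant and one predicted-relevant item")
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    # Share of articles reviewers skip, minus the recall lost by skipping them.
    wss = (tn + fn) / total - (1.0 - recall)
    return {"recall": recall, "precision": precision, "wss": wss}


# Hypothetical counts for 1000 screened articles (illustration only).
metrics = screening_metrics(tp=95, fp=10, tn=890, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With a low proportion of relevant articles, as in the 4.0% reported above, WSS is bounded by the share of irrelevant articles, which is why precision and WSS are reported alongside recall rather than instead of it.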

13.
Lancet Digit Health ; 4(12): e899-e905, 2022 12.
Article in English | MEDLINE | ID: mdl-36427951

ABSTRACT

Rigorous evaluation of artificial intelligence (AI) systems for image classification is essential before deployment into health-care settings, such as screening programmes, so that adoption is effective and safe. A key step in the evaluation process is the external validation of diagnostic performance using a test set of images. We conducted a rapid literature review on methods to develop test sets, published from 2012 to 2020, in English. Using thematic analysis, we mapped themes and coded the principles using the Population, Intervention, and Comparator or Reference standard, Outcome, and Study design framework. A group of screening and AI experts assessed the evidence-based principles for completeness and provided further considerations. Of the final 15 principles recommended here, five affect population, one intervention, two comparator, one reference standard, and one both reference standard and comparator. Finally, four are applicable to outcome and one to study design. Principles from the literature were useful to address biases from AI; however, they did not account for screening-specific biases, which we now incorporate. The principles set out here should be used to support the development and use of test sets for studies that assess the accuracy of AI within screening programmes, to ensure they are fit for purpose and minimise bias.


Subject(s)
Artificial Intelligence, Diagnostic Imaging, Mass Screening
14.
Cochrane Database Syst Rev ; 11: CD013652, 2022 11 17.
Article in English | MEDLINE | ID: mdl-36394900

ABSTRACT

BACKGROUND: The diagnostic challenges associated with the COVID-19 pandemic resulted in rapid development of diagnostic test methods for detecting SARS-CoV-2 infection. Serology tests to detect the presence of antibodies to SARS-CoV-2 enable detection of past infection and may detect cases of SARS-CoV-2 infection that were missed by earlier diagnostic tests. Understanding the diagnostic accuracy of serology tests for SARS-CoV-2 infection may enable development of effective diagnostic and management pathways, inform public health management decisions and understanding of SARS-CoV-2 epidemiology. OBJECTIVES: To assess the accuracy of antibody tests, firstly, to determine if a person presenting in the community, or in primary or secondary care has current SARS-CoV-2 infection according to time after onset of infection and, secondly, to determine if a person has previously been infected with SARS-CoV-2. Sources of heterogeneity investigated included: timing of test, test method, SARS-CoV-2 antigen used, test brand, and reference standard for non-SARS-CoV-2 cases. SEARCH METHODS: The COVID-19 Open Access Project living evidence database from the University of Bern (which includes daily updates from PubMed and Embase and preprints from medRxiv and bioRxiv) was searched on 30 September 2020. We included additional publications from the Evidence for Policy and Practice Information and Co-ordinating Centre (EPPI-Centre) 'COVID-19: Living map of the evidence' and the Norwegian Institute of Public Health 'NIPH systematic and living map on COVID-19 evidence'. We did not apply language restrictions. SELECTION CRITERIA: We included test accuracy studies of any design that evaluated commercially produced serology tests, targeting IgG, IgM, IgA alone, or in combination. Studies must have provided data for sensitivity, that could be allocated to a predefined time period after onset of symptoms, or after a positive RT-PCR test. 
Small studies with fewer than 25 SARS-CoV-2 infection cases were excluded. We included any reference standard to define the presence or absence of SARS-CoV-2 (including reverse transcription polymerase chain reaction tests (RT-PCR), clinical diagnostic criteria, and pre-pandemic samples). DATA COLLECTION AND ANALYSIS: We use standard screening procedures with three reviewers. Quality assessment (using the QUADAS-2 tool) and numeric study results were extracted independently by two people. Other study characteristics were extracted by one reviewer and checked by a second. We present sensitivity and specificity with 95% confidence intervals (CIs) for each test and, for meta-analysis, we fitted univariate random-effects logistic regression models for sensitivity by eligible time period and for specificity by reference standard group. Heterogeneity was investigated by including indicator variables in the random-effects logistic regression models. We tabulated results by test manufacturer and summarised results for tests that were evaluated in 200 or more samples and that met a modification of UK Medicines and Healthcare products Regulatory Agency (MHRA) target performance criteria. MAIN RESULTS: We included 178 separate studies (described in 177 study reports, with 45 as pre-prints) providing 527 test evaluations. The studies included 64,688 samples including 25,724 from people with confirmed SARS-CoV-2; most compared the accuracy of two or more assays (102/178, 57%). Participants with confirmed SARS-CoV-2 infection were most commonly hospital inpatients (78/178, 44%), and pre-pandemic samples were used by 45% (81/178) to estimate specificity. Over two-thirds of studies recruited participants based on known SARS-CoV-2 infection status (123/178, 69%). All studies were conducted prior to the introduction of SARS-CoV-2 vaccines and present data for naturally acquired antibody responses. 
Seventy-nine percent (141/178) of studies reported sensitivity by week after symptom onset and 66% (117/178) for convalescent phase infection. Studies evaluated enzyme-linked immunosorbent assays (ELISA) (165/527; 31%), chemiluminescent assays (CLIA) (167/527; 32%) or lateral flow assays (LFA) (188/527; 36%). Risk of bias was high because of participant selection (172, 97%); application and interpretation of the index test (35, 20%); weaknesses in the reference standard (38, 21%); and issues related to participant flow and timing (148, 82%). We judged that there were high concerns about the applicability of the evidence related to participants in 170 (96%) studies, and about the applicability of the reference standard in 162 (91%) studies. Average sensitivities for current SARS-CoV-2 infection increased by week after onset for all target antibodies. Average sensitivity for the combination of either IgG or IgM was 41.1% in week one (95% CI 38.1 to 44.2; 103 evaluations; 3881 samples, 1593 cases), 74.9% in week two (95% CI 72.4 to 77.3; 96 evaluations, 3948 samples, 2904 cases) and 88.0% by week three after onset of symptoms (95% CI 86.3 to 89.5; 103 evaluations, 2929 samples, 2571 cases). Average sensitivity during the convalescent phase of infection (up to a maximum of 100 days since onset of symptoms, where reported) was 89.8% for IgG (95% CI 88.5 to 90.9; 253 evaluations, 16,846 samples, 14,183 cases), 92.9% for IgG or IgM combined (95% CI 91.0 to 94.4; 108 evaluations, 3571 samples, 3206 cases) and 94.3% for total antibodies (95% CI 92.8 to 95.5; 58 evaluations, 7063 samples, 6652 cases). Average sensitivities for IgM alone followed a similar pattern but were of a lower test accuracy in every time slot. Average specificities were consistently high and precise, particularly for pre-pandemic samples which provide the least biased estimates of specificity (ranging from 98.6% for IgM to 99.8% for total antibodies). 
Subgroup analyses suggested small differences in sensitivity and specificity by test technology; however, heterogeneity in study results, timing of sample collection, and smaller sample numbers in some groups made comparisons difficult. For IgG, CLIAs were the most sensitive (convalescent-phase infection) and specific (pre-pandemic samples) compared to both ELISAs and LFAs (P < 0.001 for differences across test methods). The antigen(s) used (whether from the Spike-protein or nucleocapsid) appeared to have some effect on average sensitivity in the first weeks after onset but there was no clear evidence of an effect during convalescent-phase infection. Investigations of test performance by brand showed considerable variation in sensitivity between tests, and in results between studies evaluating the same test. For tests that were evaluated in 200 or more samples, the lower bound of the 95% CI for sensitivity was 90% or more for only a small number of tests (IgG, n = 5; IgG or IgM, n = 1; total antibodies, n = 4). More test brands met the MHRA minimum criteria for specificity of 98% or above (IgG, n = 16; IgG or IgM, n = 5; total antibodies, n = 7). Seven assays met the specified criteria for both sensitivity and specificity. In a low-prevalence (2%) setting, where antibody testing is used to diagnose COVID-19 in people with symptoms but who have had a negative PCR test, we would anticipate that 1 (1 to 2) case would be missed and 8 (5 to 15) would be falsely positive in 1000 people undergoing IgG or IgM testing in week three after onset of SARS-CoV-2 infection. In a seroprevalence survey, where prevalence of prior infection is 50%, we would anticipate that 51 (46 to 58) cases would be missed and 6 (5 to 7) would be falsely positive in 1000 people having IgG tests during the convalescent phase (21 to 100 days post-symptom onset or post-positive PCR) of SARS-CoV-2 infection. 
AUTHORS' CONCLUSIONS: Some antibody tests could be a useful diagnostic tool for those in whom molecular- or antigen-based tests have failed to detect the SARS-CoV-2 virus, including in those with ongoing symptoms of acute infection (from week three onwards) or those presenting with post-acute sequelae of COVID-19. However, antibody tests have an increasing likelihood of detecting an immune response to infection as time since onset of infection progresses and have demonstrated adequate performance for detection of prior infection for sero-epidemiological purposes. The applicability of results for detection of vaccination-induced antibodies is uncertain.
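The seroprevalence scenario in the abstract above (50% prevalence, 1000 people tested) can be checked against the reported convalescent-phase IgG sensitivity of 89.8% (95% CI 88.5 to 90.9). The sketch below is an illustrative calculation, not the review authors' method; the function name `missed_cases` and the half-up rounding rule are assumptions made for this example. The false-positive figure is not reproduced because the abstract does not report the IgG specificity it was based on.

```python
from decimal import Decimal, ROUND_HALF_UP


def missed_cases(cohort: int, prevalence: str, sensitivity: str) -> int:
    """Expected false negatives in a cohort, rounded half up to whole people.

    Inputs are passed as strings so that Decimal keeps them exact.
    """
    cases = Decimal(cohort) * Decimal(prevalence)
    missed = cases * (Decimal(1) - Decimal(sensitivity))
    return int(missed.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# Convalescent-phase IgG sensitivity from the abstract: 89.8% (95% CI 88.5 to 90.9).
point = missed_cases(1000, "0.50", "0.898")
fewest = missed_cases(1000, "0.50", "0.909")  # upper CI bound of sensitivity
most = missed_cases(1000, "0.50", "0.885")    # lower CI bound of sensitivity
print(point, fewest, most)
```

Under these assumptions the result is consistent with the "51 (46 to 58) cases would be missed" statement in the abstract.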


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , COVID-19/diagnosis , COVID-19/epidemiology , Antibodies, Viral , Immunoglobulin G , COVID-19 Vaccines , Pandemics , Seroepidemiologic Studies , Immunoglobulin M
15.
Soc Sci Med ; 314: 115428, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36272385

ABSTRACT

BACKGROUND: Health economic assessments are used to determine whether the resources needed to generate net benefit from a screening programme, driven by multiple complex benefits and harms, are justifiable. We systematically identified the benefits and harms incorporated within economic assessments evaluating antenatal and newborn screening programmes. METHODS: For this systematic review and thematic analysis, we searched the published and grey literature from January 2000 to January 2021. Studies that included an economic evaluation of an antenatal or newborn screening programme in an OECD country were eligible. We identified benefits and harms using an integrative descriptive analysis, and illustrated a thematic framework. (Systematic review registration PROSPERO, CRD42020165236). FINDINGS: The searches identified 52,244 articles and reports and 336 (242 antenatal and 95 newborn) were included. Eighty-six subthemes grouped into seven themes were identified: 1) diagnosis of screened-for condition, 2) life-years and health status adjustments, 3) treatment, 4) long-term costs, 5) overdiagnosis, 6) pregnancy loss, and 7) spillover effects on family members. Diagnosis of screened-for condition (115 studies, 47.5%), life-years and health status adjustments (90 studies, 37.2%) and treatment (88 studies, 36.4%) accounted for most of the benefits and harms included in studies evaluating antenatal screening. The same themes accounted for most of the benefits and harms included in studies assessing newborn screening. Overdiagnosis and spillover effects tended to be ignored. INTERPRETATION: Our proposed framework can be used to guide the development of future health economic assessments evaluating antenatal and newborn screening programmes, to prevent exclusion of important potential benefits and harms.


Subject(s)
Neonatal Screening , Organisation for Economic Co-Operation and Development , Infant, Newborn , Female , Pregnancy , Humans , Cost-Benefit Analysis , Prenatal Diagnosis
16.
Microbiol Spectr ; 10(5): e0246822, 2022 Oct 26.
Article in English | MEDLINE | ID: mdl-36135374

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) vaccine coverage remains incomplete, being only 15% in low-income countries. Rapid point-of-care tests predicting SARS-CoV-2 infection susceptibility in the unvaccinated may assist in risk management and vaccine prioritization. We conducted a prospective cohort study in 2,826 participants working in hospitals and Fire and Police services in England, UK, during the pandemic (ISRCTN5660922). Plasma taken at recruitment in June 2020 was tested using four lateral flow immunoassay (LFIA) devices and two laboratory immunoassays detecting antibodies against SARS-CoV-2 (UK Rapid Test Consortium's AbC-19 rapid test, OrientGene COVID IgG/IgM rapid test cassette, SureScreen COVID-19 rapid test cassette, and Biomerica COVID-19 IgG/IgM rapid test; Roche N and Euroimmun S laboratory assays). We monitored participants for microbiologically confirmed SARS-CoV-2 infection for 200 days. We estimated associations between test results at baseline and subsequent infection, using Poisson regression models adjusted for baseline demographic risk factors for SARS-CoV-2 exposure. Positive IgG results on each of the four LFIAs were associated with lower rates of subsequent infection with adjusted incidence rate ratios (aIRRs) of 0.00 (95% confidence interval, 0.00 to 0.01), 0.03 (0.02 to 0.05), 0.07 (0.05 to 0.10), and 0.09 (0.07 to 0.12), respectively. The protective association was strongest for AbC-19 and SureScreen. The aIRR for the laboratory Roche N antibody assay at the manufacturer-recommended threshold was similar to those of the two best performing LFIAs at 0.03 (0.01 to 0.10). Lateral flow devices measuring SARS-CoV-2 IgG predicted disease risk in unvaccinated individuals over a 200-day follow-up. The association of some LFIAs with subsequent infection was similar to laboratory immunoassays. 
IMPORTANCE Previous research has demonstrated an association between the detection of antibodies to SARS-CoV-2 following natural infection and protection from subsequent symptomatic SARS-CoV-2 infection. Lateral flow immunoassays (LFIAs) detecting anti-SARS-CoV-2 IgG are a cheap, readily deployed technology that has been used on a large scale in population screening programs, yet no studies have investigated whether LFIA results are associated with subsequent SARS-CoV-2 infection. In a prospective cohort study of 2,826 United Kingdom key workers, we found positivity in lateral flow test results had a strong negative association with subsequent SARS-CoV-2 infection within 200 days in an unvaccinated population. Positivity on more-specific but less-sensitive tests was associated with a markedly decreased rate of disease; protection associated with testing positive using more sensitive devices detecting lower levels of anti-SARS-CoV-2 IgG was more modest. Lateral flow tests with high specificity may have a role in estimation of SARS-CoV-2 disease risk in unvaccinated populations.


Subject(s)
COVID-19 , Humans , COVID-19/diagnosis , COVID-19/epidemiology , SARS-CoV-2 , Prospective Studies , Sensitivity and Specificity , Antibodies, Viral , Immunoassay/methods , Immunoglobulin G , Immunoglobulin M
17.
Cochrane Database Syst Rev ; 7: CD013705, 2022 Jul 22.
Article in English | MEDLINE | ID: mdl-35866452

ABSTRACT

BACKGROUND: Accurate rapid diagnostic tests for SARS-CoV-2 infection would be a useful tool to help manage the COVID-19 pandemic. Testing strategies that use rapid antigen tests to detect current infection have the potential to increase access to testing, speed detection of infection, and inform clinical and public health management decisions to reduce transmission. This is the second update of this review, which was first published in 2020. OBJECTIVES: To assess the diagnostic accuracy of rapid, point-of-care antigen tests for diagnosis of SARS-CoV-2 infection. We consider accuracy separately in symptomatic and asymptomatic population groups. Sources of heterogeneity investigated included setting and indication for testing, assay format, sample site, viral load, age, timing of test, and study design. SEARCH METHODS: We searched the COVID-19 Open Access Project living evidence database from the University of Bern (which includes daily updates from PubMed and Embase and preprints from medRxiv and bioRxiv) on 08 March 2021. We included independent evaluations from national reference laboratories, FIND and the Diagnostics Global Health website. We did not apply language restrictions. SELECTION CRITERIA: We included studies of people with either suspected SARS-CoV-2 infection, known SARS-CoV-2 infection or known absence of infection, or those who were being screened for infection. We included test accuracy studies of any design that evaluated commercially produced, rapid antigen tests. We included evaluations of single applications of a test (one test result reported per person) and evaluations of serial testing (repeated antigen testing over time). Reference standards for presence or absence of infection were any laboratory-based molecular test (primarily reverse transcription polymerase chain reaction (RT-PCR)) or pre-pandemic respiratory sample. DATA COLLECTION AND ANALYSIS: We used standard screening procedures with three people. 
Two people independently carried out quality assessment (using the QUADAS-2 tool) and extracted study results. Other study characteristics were extracted by one review author and checked by a second. We present sensitivity and specificity with 95% confidence intervals (CIs) for each test, and pooled data using the bivariate model. We investigated heterogeneity by including indicator variables in the random-effects logistic regression models. We tabulated results by test manufacturer and compliance with manufacturer instructions for use and according to symptom status. MAIN RESULTS: We included 155 study cohorts (described in 166 study reports, with 24 as preprints). The main results relate to 152 evaluations of single test applications including 100,462 unique samples (16,822 with confirmed SARS-CoV-2). Studies were mainly conducted in Europe (101/152, 66%), and evaluated 49 different commercial antigen assays. Only 23 studies compared two or more brands of test. Risk of bias was high because of participant selection (40, 26%); interpretation of the index test (6, 4%); weaknesses in the reference standard for absence of infection (119, 78%); and participant flow and timing (41, 27%). Characteristics of participants (45, 30%) and index test delivery (47, 31%) differed from the way in which and in whom the test was intended to be used. Nearly all studies (91%) used a single RT-PCR result to define presence or absence of infection. The 152 studies of single test applications reported 228 evaluations of antigen tests. Estimates of sensitivity varied considerably between studies, with consistently high specificities. Average sensitivity was higher in symptomatic (73.0%, 95% CI 69.3% to 76.4%; 109 evaluations; 50,574 samples, 11,662 cases) compared to asymptomatic participants (54.7%, 95% CI 47.7% to 61.6%; 50 evaluations; 40,956 samples, 2641 cases). 
Average sensitivity was higher in the first week after symptom onset (80.9%, 95% CI 76.9% to 84.4%; 30 evaluations, 2408 cases) than in the second week of symptoms (53.8%, 95% CI 48.0% to 59.6%; 40 evaluations, 1119 cases). For those who were asymptomatic at the time of testing, sensitivity was higher when an epidemiological exposure to SARS-CoV-2 was suspected (64.3%, 95% CI 54.6% to 73.0%; 16 evaluations; 7677 samples, 703 cases) compared to where COVID-19 testing was reported to be widely available to anyone on presentation for testing (49.6%, 95% CI 42.1% to 57.1%; 26 evaluations; 31,904 samples, 1758 cases). Average specificity was similarly high for symptomatic (99.1%) or asymptomatic (99.7%) participants. We observed a steady decline in summary sensitivities as measures of sample viral load decreased. Sensitivity varied between brands. When tests were used according to manufacturer instructions, average sensitivities by brand ranged from 34.3% to 91.3% in symptomatic participants (20 assays with eligible data) and from 28.6% to 77.8% for asymptomatic participants (12 assays). For symptomatic participants, summary sensitivities for seven assays were 80% or more (meeting acceptable criteria set by the World Health Organization (WHO)). The WHO acceptable performance criterion of 97% specificity was met by 17 of 20 assays when tests were used according to manufacturer instructions, 12 of which demonstrated specificities above 99%. For asymptomatic participants the sensitivities of only two assays approached but did not meet WHO acceptable performance standards in one study each; specificities for asymptomatic participants were in a similar range to those observed for symptomatic people. At 5% prevalence using summary data in symptomatic people during the first week after symptom onset, the positive predictive value (PPV) of 89% means that 1 in 10 positive results will be a false positive, and around 1 in 5 cases will be missed. 
At 0.5% prevalence using summary data for asymptomatic people, where testing was widely available and where epidemiological exposure to COVID-19 was suspected, resulting PPVs would be 38% to 52%, meaning that between 2 in 5 and 1 in 2 positive results will be false positives, and between 1 in 2 and 1 in 3 cases will be missed. AUTHORS' CONCLUSIONS: Antigen tests vary in sensitivity. In people with signs and symptoms of COVID-19, sensitivities are highest in the first week of illness when viral loads are higher. Assays that meet appropriate performance standards, such as those set by WHO, could replace laboratory-based RT-PCR when immediate decisions about patient care must be made, or where RT-PCR cannot be delivered in a timely manner. However, they are more suitable for use as triage to RT-PCR testing. The variable sensitivity of antigen tests means that people who test negative may still be infected. Many commercially available rapid antigen tests have not been evaluated in independent validation studies. Evidence for testing in asymptomatic cohorts has increased; however, sensitivity is lower and there is a paucity of evidence for testing in different settings. Questions remain about the use of antigen test-based repeat testing strategies. Further research is needed to evaluate the effectiveness of screening programmes at reducing transmission of infection, whether mass screening or targeted approaches including schools, healthcare settings and traveller screening.
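The predictive values quoted in the abstract above follow from prevalence, sensitivity and specificity via Bayes' rule. The sketch below illustrates that relationship for one scenario: asymptomatic people with suspected exposure (sensitivity 64.3%) at 0.5% prevalence. The helper name `predictive_values` is hypothetical, and using the overall asymptomatic specificity of 99.7% for this subgroup is an assumption, since the abstract does not state the subgroup-specific specificities behind its 38% to 52% range.

```python
def predictive_values(prevalence: float, sensitivity: float, specificity: float):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    true_pos = prevalence * sensitivity
    false_neg = prevalence * (1 - sensitivity)
    false_pos = (1 - prevalence) * (1 - specificity)
    true_neg = (1 - prevalence) * specificity
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed inputs: 0.5% prevalence, 64.3% sensitivity (suspected exposure),
# 99.7% specificity (overall asymptomatic figure from the abstract).
ppv, npv = predictive_values(0.005, 0.643, 0.997)
missed_fraction = 1 - 0.643  # share of true cases the test does not detect
print(f"PPV {ppv:.0%}, NPV {npv:.1%}, missed {missed_fraction:.0%}")
```

With these inputs roughly half of positive results are true positives and about one in three cases is missed, which is in line with the upper end of the range given in the abstract.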


Subject(s)
COVID-19 , COVID-19/diagnosis , COVID-19 Testing , Humans , Pandemics , Point-of-Care Systems , SARS-CoV-2 , Sensitivity and Specificity
18.
Breast Cancer Res ; 24(1): 55, 2022 Jul 30.
Article in English | MEDLINE | ID: mdl-35907862

ABSTRACT

BACKGROUND: Abbreviated breast MRI (abMRI) is being introduced in breast screening trials and clinical practice, particularly for women with dense breasts. Upscaling abMRI provision requires the workforce of mammogram readers to learn to effectively interpret abMRI. The purpose of this study was to examine the diagnostic accuracy of mammogram readers to interpret abMRI after a single day of standardised small-group training and to compare diagnostic performance of mammogram readers experienced in full-protocol breast MRI (fpMRI) interpretation (Group 1) with that of those without fpMRI interpretation experience (Group 2). METHODS: Mammogram readers were recruited from six NHS Breast Screening Programme sites. Small-group hands-on workstation training was provided, with subsequent prospective, independent, blinded interpretation of an enriched dataset with known outcome. A simplified form of abMRI (first post-contrast subtracted images (FAST MRI), displayed as maximum-intensity projection (MIP) and subtracted slice stack) was used. Per-breast and per-lesion diagnostic accuracy analysis was undertaken, with comparison across groups, and double-reading simulation of a consecutive screening subset. RESULTS: 37 readers (Group 1: 17, Group 2: 20) completed the reading task of 125 scans (250 breasts) (total = 9250 reads). Overall sensitivity was 86% (95% confidence interval (CI) 84-87%; 1776/2072) and specificity 86% (95%CI 85-86%; 6140/7178). Group 1 showed significantly higher sensitivity (843/952; 89%; 95%CI 86-91%) and higher specificity (2957/3298; 90%; 95%CI 89-91%) than Group 2 (sensitivity = 83%; 95%CI 81-85% (933/1120) p < 0.0001; specificity = 82%; 95%CI 81-83% (3183/3880) p < 0.0001). Inter-reader agreement was higher for Group 1 (kappa = 0.73; 95%CI 0.68-0.79) than for Group 2 (kappa = 0.51; 95%CI 0.45-0.56). 
Specificity improved for Group 2, from the first 55 cases (81%) to the remaining 70 (83%) (p = 0.02) but not for Group 1 (90-89% p = 0.44), whereas sensitivity remained consistent for both Group 1 (88-89%) and Group 2 (83-84%). CONCLUSIONS: Single-day abMRI interpretation training for mammogram readers achieved an overall diagnostic performance within benchmarks published for fpMRI but was insufficient for diagnostic accuracy of mammogram readers new to breast MRI to match that of experienced fpMRI readers. Novice MRI reader performance improved during the reading task, suggesting that additional training could further narrow this performance gap.
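The sensitivity and specificity figures in the abstract above can be recomputed from the read counts it reports. The sketch below does this and attaches a Wilson score interval to each proportion. The Wilson method is a swapped-in, commonly used choice: the abstract does not say how its confidence intervals were derived, and this simple version ignores clustering of reads within readers, so the bounds may differ slightly from the published ones. The function and dictionary names are illustrative.

```python
import math


def proportion_with_wilson_ci(successes: int, total: int, z: float = 1.96):
    """Return (proportion, lower, upper) using the Wilson score interval."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, centre - half, centre + half


# (correct reads, total reads) as reported in the abstract.
counts = {
    "overall sensitivity": (1776, 2072),
    "overall specificity": (6140, 7178),
    "group 1 sensitivity": (843, 952),
    "group 1 specificity": (2957, 3298),
    "group 2 sensitivity": (933, 1120),
    "group 2 specificity": (3183, 3880),
}

results = {name: proportion_with_wilson_ci(k, n) for name, (k, n) in counts.items()}
for name, (p, low, high) in results.items():
    print(f"{name}: {p:.0%} ({low:.0%} to {high:.0%})")
```

The group counts also sum to the overall totals (952 + 1120 = 2072 reads of positive breasts and 3298 + 3880 = 7178 reads of negative breasts), which matches the 9250 reads reported.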


Subject(s)
Breast Neoplasms , Breast/diagnostic imaging , Breast Neoplasms/diagnostic imaging , Female , Humans , Magnetic Resonance Imaging/methods , Mammography/methods , Prospective Studies , Sensitivity and Specificity
19.
BMC Med Res Methodol ; 22(1): 192, 2022 Jul 12.
Article in English | MEDLINE | ID: mdl-35820893

ABSTRACT

BACKGROUND: Meta-analyses of test accuracy studies may provide estimates that are highly improbable in clinical practice. Tailored meta-analysis produces plausible estimates for the accuracy of a test within a specific setting by using information from the target setting to select only the studies compatible with it. The aim of this study was to validate the tailored meta-analysis approach by comparing outcomes from tailored meta-analysis with outcomes from a setting-specific test accuracy study. METHODS: A retrospective cohort study of primary care electronic health records provided setting-specific data on the test positive rate and disease prevalence. These data were used to tailor the study selection from a review of faecal calprotectin (FC) testing for inflammatory bowel disease (IBD) for meta-analysis using the binomial method and the Mahalanobis distance method. Tailored estimates were compared to estimates from a study of test accuracy in primary care using the same routine dataset. RESULTS: Tailoring resulted in the inclusion of 3/14 (binomial method) and 9/14 (Mahalanobis distance method) studies in meta-analysis. Sensitivity and specificity from tailored meta-analysis were 0.87 (95% CI 0.77 to 0.94) and 0.65 (95% CI 0.60 to 0.69), respectively, using the binomial method, and 0.98 (95% CI 0.83 to 0.999) and 0.68 (95% CI 0.65 to 0.71), respectively, using the Mahalanobis distance method. The corresponding estimates for the conventional meta-analysis were 0.94 (95% CI 0.90 to 0.97) and 0.67 (95% CI 0.57 to 0.76) and for the FC test accuracy study of primary care data 0.93 (95% CI 0.89 to 0.96) and 0.61 (95% CI 0.6 to 0.63) to detect IBD at a threshold of 50 µg/g. Although the binomial method produced a plausible estimate, the tailored estimates of sensitivity and specificity were not closer to the primary study estimates than the estimates from conventional meta-analysis including all 14 studies. 
CONCLUSIONS: Tailored meta-analysis does not always produce estimates of sensitivity and specificity that lie closer to the estimates derived from a primary study in the setting in question. Potentially, tailored meta-analysis may be improved using a constrained model approach and this requires further investigation.


Subject(s)
Inflammatory Bowel Diseases , Leukocyte L1 Antigen Complex , Chronic Disease , Humans , Inflammatory Bowel Diseases/diagnosis , Inflammatory Bowel Diseases/epidemiology , Retrospective Studies , Sensitivity and Specificity
...